Dynamical order and many-body correlations in zebrafish show that three is a crowd

Zebrafish constitute a convenient laboratory-based biological system for studying collective behavior. It is possible to interpret a group of zebrafish as a system of interacting agents and to apply methods developed for the analysis of systems of active and even passive particles. Here, we consider the effect of group size. We focus on two- and many-body spatial correlations and dynamical order parameters to investigate the multistate behavior. For geometric reasons, the smallest group of fish which can exhibit this multistate behavior consisting of schooling, milling and swarming is three. We find that states exhibited by groups of three fish are similar to those of much larger groups, indicating that there is nothing more than a gradual change in weighting between the different states as the system size changes. Remarkably, when we consider small groups of fish sampled from a larger group, we find very little difference in the occupancy of the state with respect to isolated groups, nor is there much change in the spatial correlations between the fish. This indicates that fish interact predominantly with their nearest neighbors, perceiving the rest of the group as a fluctuating background. Therefore, the behavior of a crowd of fish is already apparent in groups of three fish.


REVIEWER COMMENTS
Reviewer #1 (Remarks to the Author): I liked this study and think it will make a substantial impact to understanding the collective behaviour of animals in general (not just zebrafish, or even just fish). While the approach taken is one from the physical sciences, the findings will be of widespread interest to biologists studying collective behaviour. This is because we typically assume that inter-individual interactions must change in larger groups (as there are more individuals to interact with) and result in different collective behaviours. While this effect is likely to saturate at large group sizes (a fish is unlikely to be able to tell if it's in a group of 100 or 1000), showing that collective behaviour in large groups is already seen in groups of only 3 is not what I would predict. It is excellent to see that this is done with 3D tracking, as most studies on fish collective behaviour have approximated the system to 2D (with fish in shallow water) for ease of data collection. I found the paper to be very well written, being clear and well-explained enough that it should be understandable to those with a background in biology while still satisfying those from the physical sciences.
I thought the abstract clearly lays out the approach and findings. However, I found the first three sentences didn't make the work sound important enough and with widespread appeal to justify publication in a high impact journal with a broad readership. The authors could, for example (there are other ways to do this), first state how collective movement is key to group living in a diverse range of animals, then how much research has been dedicated to deciphering the inter-individual interactions within groups, and then establish the research question of how these interactions scale with group size, stating that this is poorly understood.
At the start of the "Brief methodology", it would be useful to explain why a "bowl shaped tank with parabolic section" was used, especially given that the authors later demonstrate that the geometry within which the fish can move is important.
Please explain briefly but explicitly what "lab reference frame" and "reference frame of fish i" are; this could be done in the figure 2 legend.
Fig. 3 legend: It wasn't clear to me what the grey contour lines behind the fish trajectories represented.
I think the label in the figure 4 legend, "(d) Probability distribution of the three-fish bond angle", should be labelled (c)? Also, further explanation of what "three-fish bond angle" is would be worth adding.
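It would benefit the paper to include in the Conclusion a discussion of the biological implications of the findings, especially "nearest neighbor interactions dominate and the other fish are less important". This would only require a paragraph, but could include the importance for predation risk, as this result implies that during predatory attacks that disturb the structure of large groups, the fish do not need to change the neighbours that they are interacting with. For example, for an analysis of fish collective behaviour during attacks from predators, see Romenskyy, Maksym, et al. "Quantifying the structure and dynamics of fish shoals under predation threat in three dimensions." Behavioral Ecology 31.2 (2020): 311-321. Also, it is worth commenting on the sensory basis of collective behaviour, e.g. Pita, Diana, et al. "Vision in two cyprinid fish: implications for collective behavior." PeerJ 3 (2015): e1113, i.e. whether the lack of longer-range interactions is due to a sensory or cognitive constraint of the fish.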
In the Supplementary Information, "not an actual phase transitions" needs correcting.

University of Bristol
Reviewer #2 (Remarks to the Author): This manuscript provides nice methodological expansions of our understanding of collective behavior of fish. However, while there are several nice aspects provided, I have concerns about the methods and experimental design that prevent me to recommend acceptance for publication in its current form.
1. Sample size for each 2, 3, 4 and 50 group size is not provided. Further, the authors state that the same fish were sampled repeatedly ("brief methodology section"). This would mean the data contains either a lot of pseudo-replicates (if group sizes would have been sampled more than once) or no replicate at all (if each group size has been sampled only once) - both is problematic.
2. How are fish identified during tracking and during preceding and subsequent handling? In the methods section, the authors say "Typically, three fish (fish A, B, and C) will be transferred to a temporary tank. The fish A and B will be introduced to the observation tank, where we carried out the first two-fish observation. Then fish B will be take back to the temporary tank, while fish C being introduced to the observation tank, so that the second two-fish experiment could be carried out. Finally, all the fish were placed in the observation tank, and we perform the three-fish observation." This is only possible when fish would have been identified and IDs were traced throughout the experiment. Furthermore, this procedure would treat each of the fish in the group differentially thus introducing confounding variation among the three fish. How often was each fish repeatedly tested and how long was a recording session?
How were fish IDs kept during tracking? As far as I understand, the 3D tracking has been done by combining 3 2D-tracks. This would mean that fish IDs in all three 2D tracks must have been traced throughout the recording period - for sometimes 50 individuals! Please provide proof that this has been achieved. If IDs regularly jump or switch, I don't see how the data then can be used to inform any model or interpretation.
3. The simulation model assumes constant speed "We assume that the fish move in 2D with a constant speed v0 and can only change their velocity orientation given by the angle i (Fig. 2 (a))." It is known that fish use speed adjustments during social interactions and more recent models already include this feature. (Herbert-Read et al. 2011, Herbert-Read et al. 2017, Jolles et al. 2017, Jolles et al. 2020, Klamser et al. 2021). Why isn't variable speed included in the model?

The authors investigate the motion of groups of Zebrafish of sizes 2, 3, 4, and 50 and identify three collective patterns: schooling, milling, and swarming. They discuss many-body spatial correlation and group size effects. The study involves experiments and agent-based models. They conclude that Zebrafish interact with their nearest neighbors and that groups of 3 already exhibit the three collective patterns mentioned above. This is a problem of timely interest, but there are various issues that need to be clarified/addressed.
1. There are several works on the collective motion of animal systems that are comparable to this one. I could not find what distinguishes this study from previous ones. What is the relevant, distinct take-home message here?
2. It is not clear to me how parameters have been selected. How many parameters are there? Is there a quantitative fair comparison between the model and experiments?
3. Are there alternative models/explanations? Small groups of animals were analyzed with simpler models. As in this study, it was found that the motion of individuals is fundamentally given by an attraction mechanism towards the nearest neighbors in the field of view and that leads to a strong velocity correlation among the group members. Interactions are strongly non-reciprocal and there is a front-back asymmetry. See Gomez-Nava et al. Nature Physics 18, 1494-1501 (2022). In this work, intermittent behavior refers to the fact the group moves and stops. Can one speculate that a model with fewer parameters can also reproduce Zebrafish data?
4. In previous fish models (see Turnstrom et al. PLoS Comp. Biol. 9, e1002925 (2013)) it was shown that spontaneous transitions between schooling, milling, and swarming, without varying parameter values, occur. Is this also occurring here?

Response to reviewer one
We thank the reviewer for kindly taking the trouble to read our manuscript and especially for their very positive assessment of our work. We have considered carefully the criticism of the reviewer and we have implemented their helpful suggestions in our revised manuscript. We hope that the reviewer finds our revised manuscript to be suitable for publication in Nature Communications.
Reviewer #1 (Remarks to the Author): I liked this study and think it will make a substantial impact to understanding the collective behaviour of animals in general (not just zebrafish, or even just fish). While the approach taken is one from the physical sciences, the findings will be of widespread interest to biologists studying collective behaviour. This is because we typically assume that inter-individual interactions must change in larger groups (as there are more individuals to interact with) and result in different collective behaviours. While this effect is likely to saturate at large group sizes (a fish is unlikely to be able to tell if it's in a group of 100 or 1000), showing that collective behaviour in large groups is already seen in groups of only 3 is not what I would predict. It is excellent to see that this is done with 3D tracking, as most studies on fish collective behaviour have approximated the system to 2D (with fish in shallow water) for ease of data collection. I found the paper to be very well written, being clear and well-explained enough that it should be understandable to those with a background in biology while still satisfying those from the physical sciences.
I thought the abstract clearly lays out the approach and findings. However, I found the first three sentences didn't make the work sound important enough and with widespread appeal to justify publication in a high impact journal with a broad readership. The authors could, for example (there are other ways to do this), first state how collective movement is key to group living in a diverse range of animals, then how much research has been dedicated to deciphering the inter-individual interactions within groups, and then establish the research question of how these interactions scale with group size, stating that this is poorly understood.
We thank the reviewer for their kind and highly positive assessment of our manuscript. We are also grateful for the constructive criticism that they offer here and have restructured the introduction accordingly.
At the start of the "Brief methodology", it would be useful to explain why a "bowl shaped tank with parabolic section" was used, especially given that the authors later demonstrate that the geometry within which the fish can move is important.
This geometry was selected because we wanted to measure the 3D behaviour of the zebrafish, since the fish naturally swim in 3D in rivers [Shelton, et al Zebrafish 17, no. 4 (2020)]. To capture the 3D movement of the fish with a three camera setup, we need all three cameras to view the fish without any obstacles. Therefore, the container should not be a box, in which the corners would be invisible to some cameras. We have made this point clearly in the revised manuscript.
Please explain briefly but explicitly what "lab reference frame" and "reference frame of fish i" are; this could be done in the figure 2 legend.
Thank you for this comment. We have clarified what we mean by laboratory and fish reference frames in the caption to Fig. 2.

Fig. 3 legend: It wasn't clear to me what the grey contour lines behind the fish trajectories represented.
Thank you for this comment; the grey contours have been clarified in the caption.
I think the label in the figure 4 legend, "(d) Probability distribution of the three-fish bond angle", should be labelled (c)? Also, further explanation of what "three-fish bond angle" is would be worth adding.
Thank you for kindly pointing out this typo; it has been fixed.

It would benefit the paper to include in the Conclusion a discussion of the biological implications of the findings, especially "nearest neighbor interactions dominate and the other fish are less important". This would only require a paragraph, but could include the importance for predation risk, as this result implies that during predatory attacks that disturb the structure of large groups, the fish do not need to change the neighbours that they are interacting with. For example, for an analysis of fish collective behaviour during attacks from predators, see Romenskyy, Maksym, et al. "Quantifying the structure and dynamics of fish shoals under predation threat in three dimensions." Behavioral Ecology 31.2 (2020): 311-321. Also, it is worth commenting on the sensory basis of collective behaviour, e.g. Pita, Diana, et al. "Vision in two cyprinid fish: implications for collective behavior." PeerJ 3 (2015): e1113, i.e. whether the lack of longer-range interactions is due to a sensory or cognitive constraint of the fish.
Thank you for this suggestion. We have included a paragraph in the conclusions, which we believe goes towards addressing the biological implications of our findings.

In the Supplementary Information, "not an actual phase transitions" needs correcting.
Thank you; this has been addressed.

Reviewer #2 (Remarks to the Author):
We thank the reviewer for their careful reading of our manuscript and for their thoughtful and constructive criticism. We are also grateful for their positive assessment of the potential for our manuscript. We have endeavoured to improve the manuscript in line with the reviewer's comments. In particular, we have extended our model to include variable speed in the agents and include the results in our revised manuscript. We hope that this revised manuscript meets the reviewer's expectations and that it is suitable for publication in Nature Communications.
This manuscript provides nice methodological expansions of our understanding of collective behavior of fish. However, while there are several nice aspects provided, I have concerns about the methods and experimental design that prevent me to recommend acceptance for publication in its current form.
1. Sample size for each 2, 3, 4 and 50 group size is not provided. Further, the authors state that the same fish were sampled repeatedly ("brief methodology section"). This would mean the data contains either a lot of pseudo-replicates (if group sizes would have been sampled more than once) or no replicate at all (if each group size has been sampled only once) - both is problematic.
Thank you for pointing this out. The sample sizes are as described below, and the following text has been added to the methods section of the revised manuscript. "The experimental data were sampled as follows. For the two fish experiment, we selected 12 different pairs randomly chosen from a group of 50, and observed each pair for one hour. For the three fish experiment, we selected 6 different triplets from a group of 50, and observed each triplet for one hour. For the four fish experiment, we selected 5 different quadruplets from a group of 50, and observed each quadruplet for one hour. The group of 50 itself was observed for one hour." In this way, our repeated measurements of a given group size sampled different fish, rather than taking multiple measurements of the same fish, so that our analyses are not specific to any one group. We acknowledge that the previous manuscript lacked a detailed description of the sample size, and we have included this information in the updated manuscript.
2. How are fish identified during tracking and during preceding and subsequent handling? In the methods section, the authors say "Typically, three fish (fish A, B, and C) will be transferred to a temporary tank. The fish A and B will be introduced to the observation tank, where we carried out the first two-fish observation. Then fish B will be taken back to the temporary tank, while fish C being introduced to the observation tank, so that the second two-fish experiment could be carried out. Finally, all the fish were placed in the observation tank, and we perform the three-fish observation." This is only possible when fish would have been identified and IDs were traced throughout the experiment. Furthermore, this procedure would treat each of the fish in the group differentially thus introducing confounding variation among the three fish. How often was each fish repeatedly tested and how long was a recording session?
Thank you for pointing out the issue with the identity of the fish; we certainly agree that more clarification would help. We did not use actual markers to identify the fish, and we always chose the fish with no preference. However, our fixed recording procedure enables us to label and track the fish in each individual experiment. Operationally, we carry out the following tasks in a typical experiment.
Since we always perform selection without preference, we do not know exactly how many times one fish was selected. There are 50 fish in total; therefore the probability of one fish being selected twice is 1.4%. We acknowledge that our previous manuscript lacked the detail of the experimental procedure. This description has been added to the SI of our revised manuscript.
How were fish IDs kept during tracking? As far as I understand, the 3D tracking has been done by combining 3 2D-tracks. This would mean that fish IDs in all three 2D tracks must have been traced throughout the recording period - for sometimes 50 individuals! Please provide proof that this has been achieved. If IDs regularly jump or switch, I don't see how the data then can be used to inform any model or interpretation.
This is a very good point, and we acknowledge that we should have been clearer in the manuscript. We do not have 100% confidence in the fish IDs during the tracking [our confidence is around 90% for a group of 50 fish and higher for smaller groups, see below and Yang et al. PLOS Computational Biology 18 14 (2022)]. In some circumstances, this can be a significant issue which limits the determination of correlation functions that directly require the IDs to be consistently correct, for instance the time correlation function [Nagy et al Nature 464 890 (2010)]. However, for the results presented in our current manuscript, we only utilized the positions and velocities of the fish, so the switching of IDs is not a major issue here.
In fact we did not get 3D trajectories by combining 2D trajectories. Instead, we first obtain 3D coordinates of the fish, and then link these locations into trajectories. This association process (to find the same fish in pictures taken from different cameras) can be carried out just by considering the multi-view geometry, as well as the refraction of water. We then followed the existing [...] The linking process is not perfect, as we encounter breaking of trajectories, precisely because the IDs of the fish are not followed correctly all the time. Even though we are aware of existing software that handles the IDs of individuals better, it is hard for us to use such "conventional" 2D tracking software because we used a multiple-camera setup to reconstruct 3D trajectories. In our case, the relative distances between the fish and the cameras change frequently, so the size and shape of the fish are not consistent. To the best of our knowledge, the changing shapes break the assumption of IdTracker [Pérez-Escudero et al. Nature Methods 11 743 (2014)] and similar software.
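To make the idea of linking 3D locations into trajectories concrete, the following is a minimal sketch of one generic approach: greedy nearest-neighbour matching between two consecutive frames with a distance cutoff. It is not the authors' algorithm (the FishPy code should be consulted for that); the function name `link_frames`, the cutoff `max_dist`, and the toy coordinates are all hypothetical. It only illustrates how a missed detection leaves a fish unmatched, so that its trajectory breaks.

```python
import math

def link_frames(prev_points, next_points, max_dist):
    """Greedily pair each 3D point in prev_points with a point in next_points.

    Pairs are accepted in order of increasing distance, each point is used at
    most once, and pairs farther apart than max_dist are never linked.
    Returns {index in prev_points: index in next_points}; a missing key means
    that trajectory breaks at this step.
    """
    candidates = []
    for i, p in enumerate(prev_points):
        for j, q in enumerate(next_points):
            d = math.dist(p, q)
            if d <= max_dist:
                candidates.append((d, i, j))
    candidates.sort()

    links = {}
    used_next = set()
    for _, i, j in candidates:
        if i in links or j in used_next:
            continue
        links[i] = j
        used_next.add(j)
    return links

# Toy data (hypothetical): three fish at time t, but only two detections at t+1.
frame_t = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
frame_t1 = [(5.2, 0.0, 0.0), (0.1, 0.0, 0.1)]

links = link_frames(frame_t, frame_t1, max_dist=1.0)
broken = [i for i in range(len(frame_t)) if i not in links]
print(links, broken)
```

In this toy case the third fish has no detection within the cutoff, so its track ends and a new ID would be issued when it reappears, which is the kind of trajectory breaking described above.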
To tackle the 3D tracking task, we developed our own tracking system from scratch, and we briefly mentioned the system in our previous publication [Yang et al. PLOS Computational Biology 18 14 (2022)]. We also shared all the related source code on GitHub (https://github.com/yangyushi/FishPy).
We are confident that our camera system and tracking code produce good results, because we tested them with simulated data. As described in the SI of our previous paper [Yang et al. 2022], we simulated the trajectories of 50 Vicsek agents and rendered these agents in simulated conditions mimicking our experimental setup. Knowing the ground truth of the movement of these agents, we reconstructed the movements with our algorithms. By comparing the ground truth with our reconstruction, we learned that the algorithm locates 45 fish out of fifty. The missing fish would cause IDs to change and trajectories to break, but have a limited impact on the calculated order parameters and correlation functions. Furthermore, we limited our analysis to functions that only require the locations and velocities of the fish group, regardless of the fish ID, as shown in equations (1) and (2) in the main text. Therefore, the problem of fish being assigned an incorrect ID only enters the calculation indirectly by causing the velocities to be inaccurate. For example, if fish A at time t was assigned to fish B at time t+1, then the only error affecting the analysis is a wrong velocity vector at time t. Since we repeatedly carried out long observations, we have enough data to ensure that these infrequent errors do not contribute significantly.
We have addressed this point in the revised manuscript and added a discussion to the SI.
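The argument about ID switches can be illustrated with a small sketch. It assumes a polarization-type order parameter (the norm of the mean unit velocity) as a stand-in; this is not claimed to be equation (1) or (2) of the manuscript, and the function names and toy trajectories are hypothetical. The sketch shows two things: relabelling the fish within a frame leaves such a quantity unchanged, and a persistent label swap corrupts the finite-difference velocities only at the step where the swap happens.

```python
import math

def polarization(velocities):
    """Norm of the mean unit velocity: 1 for perfect alignment, near 0 for disorder."""
    sx = sy = sz = 0.0
    for vx, vy, vz in velocities:
        norm = math.sqrt(vx * vx + vy * vy + vz * vz)
        sx += vx / norm
        sy += vy / norm
        sz += vz / norm
    return math.sqrt(sx * sx + sy * sy + sz * sz) / len(velocities)

def velocities_between(frame_a, frame_b):
    """Finite-difference velocity per ID slot between two consecutive frames."""
    return [tuple(b - a for a, b in zip(pa, pb)) for pa, pb in zip(frame_a, frame_b)]

def swap_ids(frame, i, j):
    """Return a copy of the frame with the positions in ID slots i and j exchanged."""
    swapped = list(frame)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped

# Toy ground truth (hypothetical): 5 fish, 10 units apart, all moving +1 in x per frame.
n_fish = 5
frames = [[(float(t), 10.0 * k, 0.0) for k in range(n_fish)] for t in range(3)]

# The tracker exchanges the IDs of fish 0 and 1 from frame 1 onwards.
tracked = [frames[0], swap_ids(frames[1], 0, 1), swap_ids(frames[2], 0, 1)]

p_swap_step = polarization(velocities_between(tracked[0], tracked[1]))  # corrupted step
p_next_step = polarization(velocities_between(tracked[1], tracked[2]))  # clean again

# Relabelling within a frame does not change the order parameter.
mixed = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.5)]
p_original = polarization(mixed)
p_relabelled = polarization(list(reversed(mixed)))

print(p_swap_step, p_next_step, p_original, p_relabelled)
```

Only the step containing the swap yields spurious velocities for the two affected fish; from the next step onwards the velocities, and hence the order parameter, are correct again even though the labels remain exchanged.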
3. The simulation model assumes constant speed "We assume that the fish move in 2D with a constant speed v0 and can only change their velocity orientation given by the angle in (Fig. 2 (a))." It is known that fish use speed adjustments during social interactions and more recent models already include this feature. (Herbert-Read et al. 2011, Herbert-Read et al. 2017, Jolles et al. 2017, Jolles et al. 2020, Klamser et al. 2021). Why isn't variable speed included in the model?
Thank you for the comment. It is quite true that including variable speed in the model has the potential to influence our results. And, as the reviewer points out, there are examples in the literature where differences between individuals in biological systems have been studied. We have therefore incorporated these effects into our model, by allowing variable speeds among the agents. We have now performed new simulations with variable speed and analysed these. For the observables that we use, this does not in fact substantially change the outcome. We discuss this new aspect of the study in our revised manuscript in the context of the references the referee has kindly provided and other work in the literature.